feat: better search#1550
Conversation
WalkthroughThis PR implements a pluggable search engine architecture supporting database (default) and Meilisearch backends. The database engine uses dialect-aware SQL predicates for accent- and case-insensitive matching. Meilisearch adds typo tolerance and auto-indexing with debounced mutation coalescing. Entity queries now support ChangesSearch Engine Infrastructure & Integration
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
|
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | Docstring coverage is 37.74% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. | |
| Title check | ❓ Inconclusive | The title 'feat: better search' is vague and generic, using non-descriptive language that doesn't clearly convey the specific implementation details (tokenization, Meilisearch support, database engine abstraction). | Consider a more specific title like 'feat: add tokenized search with Meilisearch support' or 'feat: implement pluggable search engine abstraction with Meilisearch integration'. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | PR description covers required sections: type (feature), what it does (tokenized search + Meilisearch), related issues (#1021, #438), and reviewer notes. All mandatory elements present. |
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
✨ Simplify code
- Create PR with simplified code
- Commit simplified code in branch
mk/better-search
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (3)
backend/internal/sys/config/conf_search_test.go (1)
30-30: Fix:sentinelexists in this test package, soconf_search_test.gocompiles.
backend/internal/sys/config/conf_search_test.gousessentinel, which is declared at package scope inbackend/internal/sys/config/conf_redact_test.goasconst sentinel = redactedValue, making it available to otherpackage configtest files.Security: keep the redaction sentinel stable/non-plausible (
redactedValue) so real secrets can’t accidentally equal the marker.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/internal/sys/config/conf_search_test.go` at line 30, The test in conf_search_test.go calls assert.Contains(t, string(out), sentinel) but sentinel must be defined at package scope; declare const sentinel = redactedValue in the package test file conf_redact_test.go (where other tests share package config) so conf_search_test.go compiles, and ensure redactedValue remains a stable, non-plausible marker to avoid matching real secrets.Taskfile.yml (1)
118-118: ⚡ Quick winBind the test Meilisearch port to loopback only.
Publishing on all interfaces is broader than needed for local integration tests; bind to
127.0.0.1to reduce exposure on shared runners/dev hosts.Suggested fix
- - docker run -d --rm --name homebox-meili-test -p 7711:7700 -e MEILI_MASTER_KEY=test-master-key -e MEILI_NO_ANALYTICS=true getmeili/meilisearch:v1.22 + - docker run -d --rm --name homebox-meili-test -p 127.0.0.1:7711:7700 -e MEILI_MASTER_KEY=test-master-key -e MEILI_NO_ANALYTICS=true getmeili/meilisearch:v1.22🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Taskfile.yml` at line 118, The docker run command currently publishes Meilisearch on all interfaces (-p 7711:7700); restrict it to loopback by changing the port mapping to -p 127.0.0.1:7711:7700 in the docker run line (the entry starting with "docker run -d --rm --name homebox-meili-test ... getmeili/meilisearch:v1.22") so the container only binds to localhost for test runs.docs/src/content/docs/en/quick-start/configure/search.mdx (1)
42-43: ⚡ Quick winPrefer scoped Meilisearch keys in examples, not the master key.
Using
your_master_keyin sample config normalizes over-privileged credentials. Recommend showing a scoped key as the default example and reserving master-key usage for setup/admin notes.Suggested doc tweak
- - HBOX_SEARCH_MEILISEARCH_API_KEY=your_master_key + - HBOX_SEARCH_MEILISEARCH_API_KEY=your_homebox_rw_key @@ - - MEILI_MASTER_KEY=your_master_key + - MEILI_MASTER_KEY=your_bootstrap_master_key @@ -| HBOX_SEARCH_MEILISEARCH_API_KEY | | Meilisearch API key (the master key, or a key with index read/write access) | +| HBOX_SEARCH_MEILISEARCH_API_KEY | | Meilisearch API key (prefer a scoped key with index read/write access; avoid using the master key for app runtime) |Also applies to: 49-50, 62-63
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/src/content/docs/en/quick-start/configure/search.mdx` around lines 42 - 43, Replace the example value for the HBOX_SEARCH_MEILISEARCH_API_KEY env var so it shows a scoped/search-only key (e.g., "your_scoped_search_key" or "scoped_search_key") instead of "your_master_key" in all examples, and add a brief note nearby that the master key should only be used for setup/admin operations; update every occurrence of HBOX_SEARCH_MEILISEARCH_API_KEY in the file to use the scoped-key example and keep master-key usage documented only in an admin/setup note.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/internal/data/search/database.go`:
- Around line 133-155: The code uses sync.Once (e.unaccentOnce) in
unaccentAvailable which can permanently cache a transient context-cancel/timeout
failure; replace the sync.Once approach with an explicit lock + checked flag so
transient ctx errors don't mark the probe as finished: add (or use) a mutex
(e.g., unaccentMu) and a bool (e.g., unaccentChecked) on DatabaseEngine, then in
unaccentAvailable acquire the mutex, if unaccentChecked return e.unaccent; run
the probe using the provided ctx, but if the probe fails due to ctx
cancellation/deadline (ctx.Err()==context.Canceled or context.DeadlineExceeded)
do not set unaccentChecked and return (so future calls will retry); only set
e.unaccent and unaccentChecked=true when the probe completes without a
caller-context cancellation (successful probe or definitive DB response).
Reference: unaccentAvailable, e.unaccentOnce, e.unaccent (replace e.unaccentOnce
use with unaccentMu + unaccentChecked).
In `@backend/internal/data/search/meilisearch.go`:
- Around line 357-397: The reindex loop builds `indexed` from multiple
offset-paged queries which is not a stable snapshot and can miss concurrent
mutations; to fix, make `reindex` first capture a stable list of target entity
IDs (e.g., call the same query but use .IDs(ctx) or a single transaction to
collect all entity IDs matching the filters) and then iterate that static slice
in batches (using meiliReindexBatch) to load full entities, call
buildMeiliDocument, add documents with e.index.AddDocumentsWithContext and
populate the `indexed` set from the static ID list; finally call pruneStale with
that `indexed` set so pruning is based on the immutable snapshot rather than
offset-paged, concurrently-changing results.
In `@backend/internal/sys/config/conf_search_test.go`:
- Around line 12-21: The test Test_SearchConf_Defaults reads environment
variables via conf.Parse so make it deterministic by clearing the relevant
HBOXTEST_* env keys before parsing: in the test (before calling conf.Parse) call
t.Setenv("HBOXTEST_SEARCH_DRIVER", ""),
t.Setenv("HBOXTEST_SEARCH_MEILISEARCH_HOST", ""),
t.Setenv("HBOXTEST_SEARCH_MEILISEARCH_INDEX", ""), and
t.Setenv("HBOXTEST_SEARCH_MEILISEARCH_MAXHITS", "") (or any other HBOXTEST_*
keys your config reads) so conf.Parse uses the code defaults; keep the rest of
the assertions unchanged.
In `@backend/internal/sys/config/conf.go`:
- Around line 80-81: The Meilisearch config exposes Host and APIKey without
enforcing secure transport; add a validation routine (e.g., ValidateMeiliConfig
or a method Validate on the struct that contains Host and APIKey) that parses
Host as a URL and returns an error if the scheme is not "https" unless the host
is explicitly local (allow "localhost", "127.0.0.1", "[::1]" and their ports);
call this validation during config load/initialization so non-local HTTP
endpoints are rejected and only localhost may use http. Ensure the validation
uses url.Parse, checks u.Scheme and normalized u.Hostname() and returns clear
errors for invalid/missing Host or insecure scheme.
In `@Taskfile.yml`:
- Around line 120-121: The health-check loop line "until curl -sf
http://localhost:7711/health > /dev/null; do sleep 0.5; done" can hang CI
indefinitely; replace it with a bounded wait that times out and fails the task
if Meilisearch doesn't become healthy (for example, implement a retry counter or
elapsed-time check and exit non-zero after a max wait like 30s/60 attempts).
Keep the subsequent test invocation "TEST_MEILISEARCH_URL=...
TEST_MEILISEARCH_KEY=... go test ./internal/data/search/ -v -count=1" unchanged;
ensure the bounded-wait returns success only when the curl succeeds and
otherwise prints a clear error and exits non-zero so the task fails fast.
---
Nitpick comments:
In `@backend/internal/sys/config/conf_search_test.go`:
- Line 30: The test in conf_search_test.go calls assert.Contains(t, string(out),
sentinel) but sentinel must be defined at package scope; declare const sentinel
= redactedValue in the package test file conf_redact_test.go (where other tests
share package config) so conf_search_test.go compiles, and ensure redactedValue
remains a stable, non-plausible marker to avoid matching real secrets.
In `@docs/src/content/docs/en/quick-start/configure/search.mdx`:
- Around line 42-43: Replace the example value for the
HBOX_SEARCH_MEILISEARCH_API_KEY env var so it shows a scoped/search-only key
(e.g., "your_scoped_search_key" or "scoped_search_key") instead of
"your_master_key" in all examples, and add a brief note nearby that the master
key should only be used for setup/admin operations; update every occurrence of
HBOX_SEARCH_MEILISEARCH_API_KEY in the file to use the scoped-key example and
keep master-key usage documented only in an admin/setup note.
In `@Taskfile.yml`:
- Line 118: The docker run command currently publishes Meilisearch on all
interfaces (-p 7711:7700); restrict it to loopback by changing the port mapping
to -p 127.0.0.1:7711:7700 in the docker run line (the entry starting with
"docker run -d --rm --name homebox-meili-test ... getmeili/meilisearch:v1.22")
so the container only binds to localhost for test runs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 46c26ef3-5928-47ff-a613-8df799666f35
⛔ Files ignored due to path filters (11)
backend/app/api/static/docs/docs.gois excluded by!backend/app/api/static/docs/**backend/app/api/static/docs/openapi-3.jsonis excluded by!backend/app/api/static/docs/**backend/app/api/static/docs/openapi-3.yamlis excluded by!backend/app/api/static/docs/**backend/app/api/static/docs/swagger.jsonis excluded by!backend/app/api/static/docs/**backend/app/api/static/docs/swagger.yamlis excluded by!backend/app/api/static/docs/**backend/go.sumis excluded by!**/*.sumbackend/internal/data/ent/external.gois excluded by!backend/internal/data/ent/**docs/public/api/openapi-3.0.jsonis excluded by!docs/public/api/**docs/public/api/openapi-3.0.yamlis excluded by!docs/public/api/**docs/public/api/swagger-2.0.jsonis excluded by!docs/public/api/**docs/public/api/swagger-2.0.yamlis excluded by!docs/public/api/**
📒 Files selected for processing (30)
Taskfile.ymlbackend/app/api/cli_reset_password.gobackend/app/api/handlers/v1/v1_ctrl_entities.gobackend/app/api/main.gobackend/go.modbackend/internal/core/services/main_test.gobackend/internal/core/services/service_items_attachments_test.gobackend/internal/data/repo/main_test.gobackend/internal/data/repo/repo_entities.gobackend/internal/data/repo/repo_item_attachments_test.gobackend/internal/data/repo/repo_items_search_test.gobackend/internal/data/repo/repos_all.gobackend/internal/data/search/database.gobackend/internal/data/search/database_test.gobackend/internal/data/search/meilisearch.gobackend/internal/data/search/meilisearch_test.gobackend/internal/data/search/search.gobackend/internal/data/search/tokenize.gobackend/internal/data/search/tokenize_test.gobackend/internal/sys/config/conf.gobackend/internal/sys/config/conf_search_test.gobackend/pkgs/cgofreesqlite/sqlite.gobackend/pkgs/textutils/normalize.gobackend/pkgs/textutils/normalize_test.godocs/src/content/docs/en/quick-start/configure/database.mdxdocs/src/content/docs/en/quick-start/configure/index.mdxdocs/src/content/docs/en/quick-start/configure/search.mdxfrontend/lib/api/classes/items.tsfrontend/locales/en.jsonfrontend/pages/items.vue
There was a problem hiding this comment.
🧹 Nitpick comments (1)
backend/internal/sys/config/conf.go (1)
100-125: 💤 Low valueWell-implemented secure transport validation addressing the previous review concern.
The loopback allowlist correctly handles common cases. One minor edge case: Go's
url.Parsedoesn't normalize IPv6 addresses, so the full notationhttp://[0:0:0:0:0:0:0:1]:7700bypasses the check sinceu.Hostname()returns0:0:0:0:0:0:0:1rather than::1. This is obscure (requires intentional configuration by admin), but for completeness you could parse loopback IPs properly:🛡️ Optional: normalize IPv6 loopback detection
case "http": - switch strings.ToLower(u.Hostname()) { - case "localhost", "127.0.0.1", "::1": + host := strings.ToLower(u.Hostname()) + if host == "localhost" || host == "127.0.0.1" { return nil - default: + } + if ip := net.ParseIP(host); ip != nil && ip.IsLoopback() { + return nil + } return fmt.Errorf("search.meilisearch.host uses insecure http for non-local host %q: use https", u.Host) - }This uses
net.ParseIP(...).IsLoopback()which handles all IPv6 loopback representations.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/internal/sys/config/conf.go` around lines 100 - 125, The Validate() method of MeilisearchConf uses string comparison on u.Hostname() to detect loopback addresses, but Go's url.Parse does not normalize IPv6 addresses, causing full IPv6 notation like http://[0:0:0:0:0:0:0:1]:7700 to bypass the loopback check. Replace the string-based hostname check in the http case with a proper IP-based check by parsing the hostname using net.ParseIP() and calling IsLoopback() on the resulting net.IP object, which handles all IPv6 and IPv4 loopback representations correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@backend/internal/sys/config/conf.go`:
- Around line 100-125: The Validate() method of MeilisearchConf uses string
comparison on u.Hostname() to detect loopback addresses, but Go's url.Parse does
not normalize IPv6 addresses, causing full IPv6 notation like
http://[0:0:0:0:0:0:0:1]:7700 to bypass the loopback check. Replace the
string-based hostname check in the http case with a proper IP-based check by
parsing the hostname using net.ParseIP() and calling IsLoopback() on the
resulting net.IP object, which handles all IPv6 and IPv4 loopback
representations correctly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 5d25a27b-c0ff-4338-8637-3754bad95873
📒 Files selected for processing (6)
Taskfile.ymlbackend/internal/data/search/database.gobackend/internal/data/search/database_test.gobackend/internal/data/search/meilisearch.gobackend/internal/sys/config/conf.gobackend/internal/sys/config/conf_search_test.go
🚧 Files skipped from review as they are similar to previous changes (5)
- backend/internal/sys/config/conf_search_test.go
- Taskfile.yml
- backend/internal/data/search/database_test.go
- backend/internal/data/search/database.go
- backend/internal/data/search/meilisearch.go
# Conflicts: # backend/go.mod # backend/go.sum
Deploying homebox-docs with
|
| Latest commit: |
4dff7eb
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://792f864a.homebox-docs.pages.dev |
| Branch Preview URL: | https://mk-better-search.homebox-docs.pages.dev |
What type of PR is this?
What this PR does / why we need it:
Tokenizes the database search, and make search in general more extensible (and in that regard adds Meillisearch capabilities)
This is entirely a backend change with minimal API and front-end chages, however, it should in theory support an "e-commerce" like search experience if we wanted.
Fixes: #1021
Fixes: #438
Special notes for your reviewer:
Validated multiple times in regular database mode, and Meillisearch mode. Should double check and ensure that there's no cross-tenant issues.